All Questions

1 vote
1 answer
35 views

scipy bootstrap generates input with inconsistent numbers of samples

I have a dataset of 77 samples, and I am using scipy bootstrap to get a confidence interval to estimate the precision. I am baffled to see that it generates input variables with inconsistent numbers ...
Wouter De Coster
0 votes
0 answers
30 views

Agglomerative clustering with min and max cluster size constraints

Are there any Python packages that have agglomerative clustering algorithms with min and max cluster size constraints built in? I found a great package called KMeansConstrained but unfortunately ...
helloimgeorgia
0 votes
2 answers
908 views

Solve a non-linear system, in Python, with the GAUSS-NEWTON algorithm? (Jacobian matrix J, etc.)

I would like to solve a non-linear system (which contains the goals of a football team in previous matches) using the Gauss-Newton algorithm, in order to find the parameter (of frequency) to use as ...
Zollikofen4
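
For readers unfamiliar with the Gauss-Newton iteration named in the question above, here is a minimal single-parameter sketch. It is not the asker's football model: the exponential decay model, the data, and the names `gauss_newton_1d`, `residuals`, and `jacobian` are all hypothetical and chosen only to illustrate the update step `theta <- theta - (J^T r) / (J^T J)`.

```python
import math


def gauss_newton_1d(residuals, jacobian, theta0, tol=1e-10, max_iter=50):
    """Minimise sum(r_i(theta)^2) over a single parameter with Gauss-Newton."""
    theta = theta0
    for _ in range(max_iter):
        r = residuals(theta)
        J = jacobian(theta)  # dr_i/dtheta for each residual
        jtj = sum(j * j for j in J)
        jtr = sum(j * ri for j, ri in zip(J, r))
        if jtj == 0.0:
            raise ZeroDivisionError("Jacobian is zero; cannot compute a step")
        step = -jtr / jtj  # scalar form of -(J^T J)^{-1} J^T r
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical data generated from y = exp(-0.5 * t), without noise.
t_values = [0.0, 1.0, 2.0, 3.0]
y_values = [math.exp(-0.5 * t) for t in t_values]


def residuals(a):
    # r_i(a) = model(t_i; a) - y_i
    return [math.exp(-a * t) - y for t, y in zip(t_values, y_values)]


def jacobian(a):
    # d/da exp(-a * t) = -t * exp(-a * t)
    return [-t * math.exp(-a * t) for t in t_values]


estimate = gauss_newton_1d(residuals, jacobian, theta0=0.3)
print(estimate)
```

With several parameters the scalar division becomes a linear solve against the matrix `J^T J`, and a poor starting point can make the plain iteration diverge, which is why damped variants such as Levenberg-Marquardt are often preferred in practice.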
3 votes
2 answers
2k views

Incremental clustering algorithm

I am looking for an incremental clustering algorithm. By incremental I mean an algorithm that builds clusters starting from an initial dataset and that is able to progressively ingest new items/...
Sirion
0 votes
1 answer
183 views

Optimize a non-linear function in Python

I am trying to optimize a function using scipy.optimize, but it does not converge. I have a trading strategy with a default stop-loss based on the lowest price over 20 days. I want to optimize this ...
Pier-Olivier Marquis
1 vote
0 answers
19 views

Similarity between binary vectors with hierarchical structure

I have a dataset of binary vectors, where each vector is composed of several smaller vectors, each coming from a different parent category. Each of those categories has a different size e.g. ...
Amit be
2 votes
0 answers
35 views

What would be a good randomization environment for data science?

I would like to know if there are any best practices to optimize a random environment. Currently I use this simple structure in my config: ...
Al_P
3 votes
1 answer
690 views

Smaller alternatives to sklearn that don't require scipy?

I am packaging my model for deployment in AWS Lambda, which has a size limit of 250 MB for all dependencies. Sklearn, if you include its dependencies numpy and scipy, is a huge package. Are there any ...
coderboi
3 votes
2 answers
164 views

How to interpret ANOVA results?

I am trying to identify which attributes in my dataset are not relevant, so that I can remove them before fitting a classifier. The target is a categorical variable with three different values. I also have a lot ...
Tlaloc-ES
2 votes
1 answer
5k views

How do I force specified coefficients in a Linear Regression model to be positive?

Looking for a way to do this in Python. scipy.optimize.nnls forces all coefficients to be positive. Some additional context: I have a data frame with some explanatory variables and a response ...
Kyle Zengo
1 vote
1 answer
3k views

How to measure the correlation between categorical variables and a continuous variable

I have the following list of the names of the categorical variables in my dataset: ...
Andros Adrianopolos
1 vote
1 answer
3k views

What does sklearn's pairwise_distances with metric='correlation' do?

I've put different values into this function and observed the output. But I can't find a predictable pattern in what is being output. Then I tried digging through the function itself, but its ...
tim_xyz
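
As background for the question above: the 'correlation' metric is the correlation distance, that is, one minus the Pearson correlation between two row vectors after each has been centred on its own mean. The following is a plain-Python sketch of that formula, not the library's implementation; the helper names `correlation_distance` and `pairwise_correlation_distances` are hypothetical.

```python
import math


def correlation_distance(u, v):
    """Return 1 - Pearson correlation of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]  # centre each vector on its own mean
    dv = [b - mean_v for b in v]
    dot = sum(a * b for a, b in zip(du, dv))
    norm_u = math.sqrt(sum(a * a for a in du))
    norm_v = math.sqrt(sum(b * b for b in dv))
    # A constant vector has zero norm, so this raises ZeroDivisionError.
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_correlation_distances(rows):
    """Distance between every pair of rows, returned as a nested list."""
    return [[correlation_distance(r, s) for s in rows] for r in rows]


rows = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
for line in pairwise_correlation_distances(rows):
    print(line)
```

The distance is close to 0 for perfectly positively correlated rows and close to 2 for perfectly anti-correlated rows, so it depends only on the shape of each row, not on its scale or offset.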
1 vote
1 answer
3k views

Machine learning with sklearn vs. scipy stats

I've created 50 random x and y points (scattered around the line y = 2x - 1). First, I used Linear Regression from sklearn to fit the model onto my dataset where I got a slope of ...
haneulkim
2 votes
1 answer
32 views

Normally distribute occurrence or counts

I am creating a mock of sales data. One of the columns is salesperson_id where each id can occur more than once (a salesperson can have multiple sales). I want to ...
oikonomiyaki
0 votes
1 answer
44 views

Restricting a weight vector (optimization parameter) to be in a certain domain using python ML library linear regression model

Sorry if the title is a bit long, but basically I'm trying to predict values $\hat{y}_i \in [-1,1]$ using a simple model, i.e. something like OLS or ridge regression. I'd like to know if anyone ...
darthbungholio